The SARS-CoV-2 Alpha variant was associated with increased clinical severity of COVID-19 in Scotland: A genomics-based retrospective cohort analysis

Objectives The SARS-CoV-2 Alpha variant was associated with increased transmission relative to other variants present at the time of its emergence and several studies have shown an association between Alpha variant infection and increased hospitalisation and 28-day mortality. However, none have addressed the impact on maximum severity of illness in the general population classified by the level of respiratory support required, or death. We aimed to do this. Methods In this retrospective multi-centre clinical cohort sub-study of the COG-UK consortium, 1475 samples from Scottish hospitalised and community cases collected between 1st November 2020 and 30th January 2021 were sequenced. We matched sequence data to clinical outcomes as the Alpha variant became dominant in Scotland and modelled the association between Alpha variant infection and severe disease using a 4-point scale of maximum severity by 28 days: 1. no respiratory support, 2. supplemental oxygen, 3. ventilation and 4. death. Results Our cumulative generalised linear mixed model analyses found evidence (cumulative odds ratio: 1.40, 95% CI: 1.02, 1.93) of a positive association between increased clinical severity and lineage (Alpha variant versus pre-Alpha variants). Conclusions The Alpha variant was associated with more severe clinical disease in the Scottish population than co-circulating lineages.


Introduction
The Alpha variant of SARS-CoV-2 (Pango lineage B.1.1.7) was first identified in the UK in September 2020 and was subsequently reported in 183 countries [1]. It is defined by 21 genomic mutations or deletions, including 8 characteristic changes within the spike gene (S1 Table) [2]. These are associated with increased ACE-2 receptor binding affinity and innate and adaptive immune evasion [3][4][5][6] compared to preceding lineages. The Alpha variant, the first variant of concern (VOC), was estimated to be 50-100% more transmissible than other lineages present at the time of its emergence [7], explaining the transient dominance of this lineage globally.
The presence of a spike gene deletion (Δ69-70) results in spike-gene target failure (SGTF) in real-time reverse transcriptase polymerase chain reaction (RT-PCR) diagnostic assays and provided a useful proxy for the presence of the Alpha variant for epidemiological analysis during this time period [2]. Four large community analyses showed a positive association between the presence of SGTF and 28-day mortality, with hazard ratios of 1.55 (CI 1.39-1.72), 1.64 (CI 1.32-2.04), 1.67 (CI 1.34-2.09) and 1.73 (CI 1.41-2.13) [8][9][10]. Both SGTF (hazard ratios of 1.52 (CI 1.47-1.57), 1.62 (CI 1.48-1.78)) [11,12] and confirmed Alpha variant infection (hazard ratios of 1.34 (CI 1.07-1.66) and 1.61 (CI 1.28-2.03) [12,13] were associated with an increased risk of hospitalisation in community cases, and a smaller study of hospitalised patients found a greater risk of hypoxia at admission in those with confirmed Alpha variant infection [14]. In contrast, other smaller analyses of hospitalised patients found no association between confirmed Alpha variant infection and increased clinical severity based on a variety of indices [15][16][17]. Limited data are available on the full clinical course of disease with the Alpha variant in relation to co-circulating variants.
Understanding the clinical pattern of disease with new variants of concern is important for several reasons. Firstly, if a variant is more pathogenic than previous variants, this has implications for considering public health restrictions and the optimal functioning of health care systems. Secondly, large numbers of low-and middle-income countries still have less than 50% of their populations having been vaccinated against SARS-CoV-2 [18]. A better understanding of a variant with increased severity is important in modelling the impact of unmitigated infection in these settings. A clear understanding of the behaviour of the Alpha variant, which emerged as a dominant variant in Scotland in the winter of 2020/21, is needed as a baseline to compare the clinical phenotype of variants of concern that have subsequently emerged. Post-Alpha variants, such as Omicron (B.1.1.529), have been shown to be able to evade vaccine-induced immunity and therefore have the potential to spread even in immunised populations [19], so a historical understanding of severity remains important, as it seems unlikely that SARS-CoV-2 infections will be brought under control in the near future.
We aimed to quantify the clinical features and rate of spread of Alpha variant infections in Scotland in a comprehensive national dataset. We used whole genome sequencing data to analyse patient presentations between 1 st November 2020 and 30 th January 2021 as the variant emerged in Scotland and used cumulative generalised additive models to compare 28-day maximum clinical severity for the Alpha variant against co-circulating lineages.

Sample collection and approvals
We included all Scottish COG-UK pillar 1 samples sequenced at the MRC-University of Glasgow Centre for Virus Research (CVR) and the Royal Infirmary of Edinburgh (RIE) between 1st November 2020 and 30th January 2021. These samples derived from both hospitalised patients (59%) and community testing (41%).
Residual nucleic acid extracts derived from the nose-throat swabs of SARS-CoV-2 positive individuals whose diagnostic samples were submitted to the West of Scotland Specialist Virology Centre and Edinburgh Royal Infirmary Virus laboratory and were sequenced following ethical approvals from the West of Scotland Biorepository (16/WS/0207NHS) and the Lothian Biorepository (10/S1402/33). These samples were sequenced without consent following HTA legislation on consent exemption. Use of Scottish anonymised clinical data linked to virus genomic data without informed consent was granted by the Caldicott guardian for each site and by the Scottish Public Benefit and Privacy Panel (PBPP) for Health and Social Care (2122-0130).

Sequencing and bioinformatics
Sequencing was performed as part of the COG-UK consortium using amplicon-based next generation sequencing [20,21]. Sequence alignment, lineage assignment and tree generation were performed using the COG-UK data pipeline (https://github.com/COG-UK/datapipe) and phylogenetic pipeline (https://github.com/cov-ert/phylopipe) with pangolin lineage assignment (https://github.com/cov-lineages/pangolin) [22]. Lineage assignments were performed on 18/03/2021 and phylogenetic analysis was performed using the COG-UK tree generated on 25/02/2021. Estimates of growth rates of major lineages in Scotland were calculated from time-resolved phylogenies for lineages B.1.1.7 (Alpha), B.177 and the sub-clades B.177.5, B.177.8, and another minor B.177 sub-clade (W.4). The estimates were carried out utilising sequences from November 2020 -March 2021 in BEAST (Bayesian Evolutionary Analysis by Sampling Trees) with an exponential growth rate population model, strict molecular clock model and TN93 with four gamma rate distribution categories. Each lineage was randomly subsampled to a maximum of 5 sequences per epiweek (resulting in 52 to 103 sequences per subsample, depending on the lineage), and 10 subsamples replicates analysed per lineage in a joint exponential growth rate population model.

Clinical data
Core demographic data (age, sex, partial postcode) were collected via linkage to electronic patient records at the 7 of 14 scottish health boards (covering 78% of the scottish population) for which we had clinical data access approval, and a full retrospective review of case notes was undertaken. Collected data included residence in a care home; occupation in care home or healthcare setting; admission to hospital; date of admission, discharge and/or death and maximum clinical severity at 28 days sample collection date via a 4-point ordinal scale (1. No respiratory support; 2. Supplemental low flow oxygen; 3. Invasive ventilation, non-invasive ventilation or high-flow nasal canula (IV/NIV/HFNO); 4. Death) as previously used in Volz et al 2020 and Thomson et al 2021 [23,24].
Where available, PCR (Polymerase Chain Reaction) cycle threshold (Ct) and the PCR testing platform were recorded. Nosocomial COVID-19 was defined as a first positive PCR occurring greater than 48 hours following admission to hospital, individuals meeting this criterion were excluded from the study. Discharge status was followed up until 15th April 2021 for the hospital stay analysis. For the co-morbidity sub-analysis, delegated research ethics approval was granted for linkage to National Health Service (NHS) patient data by the Local Privacy and Advisory Committee at NHS Greater Glasgow and Clyde. Cohorts and de-identified linked data were prepared by the West of Scotland Safe Haven at NHS Greater Glasgow and Clyde.

Severity analyses
Four level severity data was analysed using cumulative (per the definition of Bürkner and Vuorre (2019)) generalised additive mixed models (GAMMs) with logit links, specifically, following Volz et al (2020) [23,25]. We analysed three subsets of the data: 1. the full dataset, 2. the dataset excluding care home patients, and 3. exclusively the hospitalised population. Further details regarding these analyses are provided in S1 Appendix.

Ct analysis
Ct value was compared between Alpha variant and pre-Alpha variant infections for those patients where the TaqPath assay (Applied Biosystems) was used. This platform was used exclusively for this analysis because different platforms output systematically different Ct values, and this was the most frequently used in our dataset (n = 154, Alpha = 38, pre-Alpha = 116). We used a generalised additive model with a Gaussian error structure and identity link, and the same covariates used as in the severity analysis to model the Ct value. The model was fitted using the brms (v. 2.14.4) R package [26]. The presented model had no divergent transitions and effective sample sizes of over 200 for all parameters. The intercept of the model was given a t-distribution (location = 20, scale = 10, df = 3) prior, the fixed effect coefficients were given normal (mean = 0, standard deviation = 5) priors, random effects and spline standard deviations were given exponential (mean = 5) priors.

Hospital length of stay analysis
Hospital length of stay was compared for Alpha variant and pre-Alpha variant patients while controlling for age and sex using a Fine and Gray model competing risks regression using the crr function in the cmprsk (v. 2.2-10) R package [27,28]. Nosocomial infections were excluded. In total, this analysis had 521 cases (Alpha = 187, pre-Alpha = 334), of which 4 were censored; 352 patients were discharged from hospital and 165 died.

Emergence of the Alpha variant in Scotland
Between 01/11/2020 and 31/01/2021 1863 samples from individuals tested in pillar 1 facilities in Scotland underwent whole genome sequencing for SARS-CoV-2. Of these, 1475 (79%) could be linked to patient records from participating scottish health boards, and were included in the analysis. The contribution of patients infected with the Alpha variant increased over the course of the study, in line with dissemination across the UK during the study period (Fig 1A  and 1B). At the time of data collection, two peaks of SARS-CoV-2 infection had occurred in the UK: the first (wave 1) in March 2020 [15] and the second in summer 2020 [29], both in association with hundreds of importations following travel to Central Europe [30]. The second peak incorporated two variant waves (waves 2 and 3), initially of B.1.177 ( Fig 1C) and then B.1.1.7/Alpha, radiating from the South of England (Fig 1E). This Alpha variant "takeover" (Fig 1D) corresponded to a five-fold increase in growth rate on an epidemiological scale relative to pre-Alpha lineages (Fig 1F).

Demographics of the clinical cohort
The age of the clinical cohort ranged from 0-105 years, (mean 66.8 years) and was slightly lower in the Alpha group (65.6 years vs. 67.2 years). Overall, 59.1% were female; this preponderance occurred in both subgroups and was higher in the Alpha subgroup (60.4% vs 58.6%). In the full cohort, 3.0% were care-home workers and 10.4% were NHS healthcare workers. 5.5% and 5.8% of those infected with the Alpha variant were care-home and other healthcare workers respectively, compared with 2.2% and 12.0% of those infected with pre-Alpha lineages. 12.9% of those in the Alpha subgroup were care-home residents, compared with 21.7% in pre-Alpha. There was also a difference in the proportion of cases admitted to Intensive Care Units: 6.3% of the Alpha group compared with 3.4% for pre-Alpha. Full details of the demographic data of the cohort can be found in Table 1 and full lineage assignments can be found in S2 Table. Clinical severity analysis Within the clinical severity cohort there were 364 Alpha cases, 1030 B.1.177 cases, and 81 cases due to one of 19 other pre-Alpha lineages (Fig 2), of which 185 Alpha cases (51%) and 440 pre-Alpha cases (38%) received oxygen or died. Consistent with previous research comparing mortality and hospitalisation in SGTF detected by PCR versus absence of SGTF, we found that Alpha variant viruses were associated with more severe disease on average than those from other lineages circulating during the same time period. In the full dataset, we observed a positive association with severity (posterior median cumulative odds ratio: 1.40, 95% CI: 1.02-1.93). In both the subsets, excluding care-home patients or limiting to hospitalised patients only, the mean estimate of the increase in severity of the Alpha variant was smaller, and the variance in the posterior distribution higher likely due to the smaller sample sizes. Given this uncertainty, we cannot determine whether the association of the Alpha variant with severity in the populations corresponding to these subsets is the same as that in the population described by the full dataset, but in all cases, the most likely direction of the effect is positive. Comorbidity data were not available for the full dataset, a sub-analysis on those cases where it could be linked indicated that comorbidities did not substantially affect relative severity estimates (S3 Appendix). Model estimates from severity models from all subsets can be found in S3-S5 Tables.
Bernoulli models looking at sequential severity categories provided weak evidence that the proportional odds assumption of the cumulative logistic model was violated. The odds ratios for the no oxygen versus low flow oxygen, and low flow oxygen versus IV/NIV/HFNC were similar to those estimated under the cumulative model (posterior median odds ratio for no oxygen versus low flow oxygen: 1.77, CI: 1.12-2.80; posterior median odds ratio for low flow oxygen versus IV/NIV/HFNC: 1.26, CI: 0.43-3.67) but with correspondingly higher posterior  was observed for the effect of biological sex, with male sex being associated with more severe outcomes for the first two sequential category models (posterior median odds ratio for no oxygen versus low flow oxygen: 1.32, CI: 0.96-1.80; posterior median odds ratio for low flow oxygen versus IV/NIV/HFNC: 3.10, CI: 1.37-7.08), but withless severe outcomes for the last (posterior odds ratio for IV/NIV/HFNC vs death: 0.62, CI: 0.19-099). Given other research on the topic has consistently identified male sex as a risk factor, this potentially indicates the existence of an important unmeasured confounder only relevant for those requiring invasive ventilation, non-invasive ventilation or high flow nasal cannula oxygen.
Estimates of the severity across the phylogeny are visible in Fig 3; see S2 Appendix for more discussion of this analysis. An analysis including comorbidities for the subset of patients where they were available implied that the inclusion of comorbidities had no impact on the results obtained, see S1 and S3 Appendices.
We also found that the Alpha variant was associated with lower Ct values than infection with pre-Alpha variants (posterior median Ct change: -2.46, 95% CI: -4.22 --0.70) as previously observed [8]. Model estimates for all parameters can be found in S6 Table. We found no evidence that the Alpha variant was associated with longer hospital stays after controlling for age and sex (HR: -0.02; 95% CI: -0.23-0.20; p = 0.89).

Discussion
In this analysis of hospitalised and community patients with Alpha variant and pre-Alpha variant SARS-CoV-2 infection, carried out as the Alpha variant became dominant in Scotland, we provide evidence of increased clinical severity associated with this variant at this time, after adjusting for age, sex, geography and calendar time, as well as testing for sensitivity to number of comorbidities. This was observed across all adult age groups, incorporating the spectrum of COVID-19 disease; from no requirement for supportive care, to supplemental oxygen requirement, the need for invasive or non-invasive ventilation, and to death. This analysis is the first to assess the full clinical severity spectrum of confirmed Alpha variant infection in both community and hospitalised cases in relation to other prevalent lineages circulating during the same time period.
Our study supports the community testing analyses that have reported an increased 28-day mortality associated with SGTF as a proxy for Alpha variant status [8][9][10]. Smaller studies found no effect of lineage on various measures of severity [15][16][17], but these were studies of patients already admitted to hospital and therefore would not pick up the granular detail of increasing disease severity resulting in a need for increasing levels of respiratory support and consequently admission to hospital. Estimated severities for each viral isolate are means and 95% credible intervals of the linear predictor change under infection with that viral genotype from the phylogenetic random effect in the cumulative severity model under a Brownian motion model of evolution. This model constrains genetically identical isolates to have identical effects, so changes should be interpreted across the phylogeny rather than between closely related isolates which necessarily have similar estimated severities. The dataset was downsampled to 100 random samples for this figure to aid readability. Figure was generated using ggtree [31]. https://doi.org/10.1371/journal.pone.0284187.g003 The association between higher viral load, higher transmission and lineage may reflect changes in the biology of the virus; for example, the Alpha variant asparagine (N) to tyrosine (Y) mutation at position 501 of the spike protein receptor binding domain (RBD) was associated with an increase in binding affinity to the human ACE2 receptor [32]. In addition, a deletion at position 69-70 may have increased virus infectivity [33]. The P681H mutation found at the furin cleavage site is associated with more efficient furin cleavage, enhancing cell entry [34]. An alternative explanation for the higher viral loads observed in Alpha variant infection may be that clinical presentation occurs earlier in the illness. Further modelling, animal experiments and studies in healthy volunteers may help to unravel the mechanisms behind this phenomenon.
Our data indicate an association between the Alpha variant and an increased risk of requiring supplemental oxygen and ventilation compared to per-Alpha variants. These two factors are critical determinants of healthcare capacity during a period of high incidence of SARS--CoV-2 infection, and this illustrates the importance for countries, in particular those with less robust health care systems and lower vaccination rates of factoring the requirement for supportive treatment into models of clinical severity and pandemic response decision planning for future SARS-CoV-2 variants of concern. This granular analysis of disease severity based on genomic confirmation of diagnosis should be used as a baseline study for clinical severity analysis of the inevitable future variants of concern.
There are some limitations to our study. Our dataset is drawn from first-line local NHS diagnostic (Pillar 1) testing which over-represents patients presenting for hospital care (59%) while those sampled in the community represented 41% of the dataset. The effect of working in the healthcare sector on severity, driven by systematically different exposures faced by frontline caregivers, could not be adjusted for, due to incompleteness of the data regarding this variable. Further, the analysis dataset employed a non-standardised approach to sampling across the study period as sequencing was carried out both as systematic randomised national surveillance and sampling following outbreaks of interest. Additionally, we did not have information about the vaccination status of the individuals in the study. However, our inability to adjust for this variable is not likely to have had a great impact on our conclusions, as, at the time of the study, the vaccination campaign had recently begun, with only over 75 year olds and high-risk groups eligible. Finally, the cumulative model used and the usage of a single (not location varying) spine for the effect of time in this analysis assumes a homogenous application of therapeutic intervention across the population. Despite these limitations, our results remain consistent with previous work on the mortality of Alpha, and this study provides new information regarding differences in infection severity.
In summary, the Alpha variant was found to be associated with a rapid increase in COVID-19 cases in Scotland in the winter of 2020/21, and an increased risk of severe infection requiring supportive care. This has implications for planning for future variant driven waves of infection, especially in countries with low vaccine uptake or if variants evolve with significant vaccine-escape. Our study has shown the value of the collection of higher resolution patient outcome data linked to genetic sequences when looking for clinically relevant differences between viral variants.
Supporting information S1